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Abstract. The Recoil Growth algorithm, proposed in 1999 by Consta et al, 
is one of the most efficient algorithm available in the literature to sample from 
a multi-polymer system. Such problems are closely related to the generation 
of self-avoiding paths. In this paper, we study a variant of the original Recoil 
Growth algorithm, where we constrain the generation of a new polymer to 
take place on a specific class of graphs. This makes it possible to make a fine 
trade-off between computational cost and success rate. We moreover give a 
simple proof for a lower bound on the irreducibility of this new algorithm, 
which applies to the original algorithm as well. 
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1. Introduction 

Designing an algorithm that efficiently samples from a multi-polymer system 
according to a given probability distribution is the focus of much research activity 
in chemical physics [2, 10]. The state of a multi-polymer system being a collection 
of self-avoiding paths that do not overlap each other, this problem is closely related 
to the classical problem in computer science of generating self-avoiding paths. The 
state space of such systems is huge, especially in high dimension or if the paths 
are long, which makes these problems hard. The main issue consists of defining 
an algorithm that both keeps the computational cost low and converges rapidly 
to the sampling distribution. For multi-polymer systems, various approaches have 
been suggested to tackle this problem, among which is the Recoil Growth (RG) 
algorithm, meant to be one of the most efficient algorithm currently available in 
the literature. For a precise description of this algorithm, the reader is referred 
to [3] 1 . In the present paper (which is an extended version of [12]), we define and 
analyze a variant of the RG algorithm, to which we refer as the RG* algorithm. 
The main ideas of both the RG and the RG* algorithms are to be found in two 
important classes of algorithms, which we briefly describe below: the Metropolis 
algorithm [9] and the auxiliary variable method [1, 7]. 
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The Metropolis algorithm. The Metropolis algorithm is a very generic algo- 
rithm that approximately samples according to a probability distribution tt defined 
on some finite state space X. It takes as other input a Markov Chain P, and 
constructs a reversible Markov Chain with tt as stationary distribution. In the 
long-run, we can therefore sample from a distribution that is arbitrarily close to 
tt. The implicit idea is that it is hard to sample from tt, whereas it is easy to 
generate P, for instance by local modifications of a state. The new Markov Chain 
is built thanks to the following rejection procedure, in which we use the notation 
A(x, y) = n(x)P(x, y). Starting from x € X, choose as candidate for the next state 
some y G X with probability P(x, y). If A(y, x) > A(x, y), then go to y. Otherwise, 
flip a coin with bias A(y, x)/A(x, y): if tails, go to y, and if heads, stay in x. By do- 
ing so, the probability of going from x to y ^ x is P(x,y) • min (l, A(y, x)/A(x, y)) , 
and it is easy to see that this Markov chain has tt as reversible distribution. An 
important remark is that for the rejection procedure, only the ratio A(y, x)/A(x, y) 
matters. In particular, it is sufficient to know tt up to its normalizing constant, 
what is very useful in many concrete applications. For instance, one could wish to 
sample complex combinatorial objects uniformly: in this case, one does not need 
to know the cardinality of these objects. 

Under a more global look, the Markov Chain P can be seen as an exogenous 
process that, at each step of the algorithm, proposes a candidate for the next step. 
In general, the stationary distribution of P is not tt, and so the role of the rejection 
procedure aims at compensating for this bias: by suitably rejecting or accepting this 
candidate, the new Markov Chain can be shown to be reversible with tt as stationary 
distribution. Reversibility is an important feature here, because it makes it very 
easy to show that tt is indeed the stationary distribution. Otherwise, since the 
objects under consideration are usually fairly complex, proving stationarity could 
be very challenging. 

The auxiliary variables method. The auxiliary variable method is a concep- 
tually easy extension of the Metropolis algorithm: instead of sampling from X 
according to tt, it is sometimes easier to sample from a pair X x y according to 
some distribution 7r(-, •) which has tt as marginal. In such cases, the idea is to 
apply the Metropolis algorithm to tt, and then to recover tt by summation. The 
new variable y G y is referred to as the "auxiliary variable" . As we will see, the 
concept of underlying graph in the RG* algorithm is very close to an auxiliary vari- 
able; underlying graphs are central in the RG* algorithm, and are introduced not 
because they make the sampling easier or more natural, but because they reduce 
the computational cost. 

The RG* algorithm. The state space of the RG* algorithm is the set of all 
possible configurations of non-intersecting polymers in some dimension d. More 
precisely, there is an underlying <i-dimcnsional grid, endowed with a cyclic struc- 
ture on the edges, on which all the objects are considered. A polymer is then just 
an undirected self-avoiding path of given length, a path being a sequence of neigh- 
boring vertices on the underlying grid. The fact that polymers cannot intersect 
each other, and that they cannot intersect themselves, comes from a very natural 
physical constraint, namely that a point in space cannot be occupied by two dif- 
ferent particles. Figure 1 shows an example of a two-dimensional multi-polymer 
system on a torus with 100 polymers of length 25 each. 
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One step of this algorithm consists in removing one polymer at random, and 
trying to re-grow it elsewhere. The growth may not be successful, in which case 
we put back the original polymer in the system, and we remain in the same state. 
But if we successfully grew a candidate polymer, by analogy with the Metropolis 
algorithm, we flip a coin to decide whether to add this polymer or to put back the 
original one. The goal of this coin-tossing is to compensate for the bias introduced 
during the growth of the candidate polymer in order to yield the right stationary 
distribution. Though it does not clearly appear at this point, we want to stress 
the fact that computing this probability of acceptance will require a fair amount 
of work. This is in sharp contrast with the Metropolis algorithm, for which it is 
straightforward to derive this quantity. 

One of the main difficulties of the RG* algorithm therefore lies in the growth of 
the new candidate polymer. Since polymers are self-avoiding paths, this problem 
is closely related to the generation of self-avoiding walks, for which an important 
amount of research has already been carried out [8, 13]. Two differences make the 
situation slightly more complex in our case. First, we have several polymers here, 
with the constraint that they cannot intersect each other. Second, each step of 
the RG* algorithm involves the attempt to grow a new polymer. In order for the 
algorithm to perform well, this growth needs to be done fast, even if it means that 
it fails often. 

The efficiency of the RG* algorithm, and hence its interest compared to other 
algorithms, can be assessed by its ability to address a number of issues that clearly 
emerge from the above global description. 
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First, the RG* algorithm must generate a state S with probability arbitrarily 
close to q(S), where q is the distribution we want to sample from. We have seen 
that this will be done by constructing a Markov chain having q as the stationary 
distribution. As in the Metropolis algorithm, once we have a candidate polymer, 
we are free to choose the probability of accepting it; we prove in this paper that 
the chain will be reversible for a suitable choice of the probability of acceptance. 
By doing so, we find that the probability of acceptance suggested in [3] is the right 
one. However, the same probability of acceptance fails to yield the right stationary 
distribution in some extended version of the algorithm, and this subtle mistake was 
not detected in the original paper. 

Second, the algorithm must mix as quickly as possible, which means that station- 
arity has to be reached as fast as possible. Even on simple examples, determining 
the convergence rate of a Metropolis type algorithm is a very hard question, as 
is shown in [4]. Nevertheless, a rapidly mixing Markov chain requires both high 
chain construction and high acceptance rates. The chain construction rate refers 
to the probability that the growth of a new candidate polymer is successful, while 
the acceptance rate refers to the probability that, given a candidate polymer, it is 
accepted. We see from the above description of the RG* algorithm that if one of 
these two rates is low, the algorithm will stay for a long time in the same state 
before changing, so that it is very unlikely that it mixes rapidly. 

These two quantities intrinsically depend on the generation of the new candidate 
polymer. For instance, we have seen that the rejection procedure of the Metropolis 
algorithm depends on the ratio A(y, x)/A(x, y), with A(x,y) = Tr(x)P(x,y): since 
the candidate y for the new state is chosen with probability P(x,y), the question 
is to know whether P(x, •) gives advantage to states that are more likely for tt. If 
so, then this latter ratio will often be above one, yielding a high acceptance rate. 
This is especially clear if P is chosen to be symmetric, in which case the acceptance 
probability is min(l, 7r(y)/n(x)). In other words, a good algorithm favors candi- 
date states that yield a good acceptance probability. As for the construction rate, a 
trade-off has to be made between the construction rate itself and the computational 
cost. Heuristically, the algorithm that generates a new polymer works by growing 
or recoiling one step at a time, until the polymer under construction has reached 
the desired length. In particular, this algorithm lives in the space of incomplete 
self-avoiding paths. This space is huge, especially if the dimension is large or if the 
polymer is long. The trade-off is therefore the following: if one allows the algorithm 
to visit all the space of incomplete self-avoiding paths, the construction rate will 
be the best possible, but the computational cost may be incredibly expensive. By 
truncating the set of incomplete self-avoiding paths which the algorithm is allowed 
to visit, one reduces this computational cost at the expense of reducing the con- 
struction rate as well. 

The RG* algorithm has two salient features that yield both high construction 
and acceptance rates: the concepts of feeler and of underlying graph. Though these 
two concepts are combined in the RG* algorithm, it is worth emphasizing that they 
can be considered and defined independently from one another. The idea of feeler 
makes it possible to define an algorithm that generates a polymer step by step by 
looking a few steps ahead. This way, the algorithm is able to detect dense areas 
where it becomes hard to grow the polymer, and, if necessary, to recoil from such 
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regions. The feeler is characterized by a parameter £, the length of the feeler, that 
tells how far away the algorithm is allowed to look. Large values of I yield a better 
construction rate, but a higher computational cost. 

As we will see, the algorithm that attempts to grow a new polymer works by 
increasing or decreasing a partial polymer by only one vertex at a time. In partic- 
ular, this algorithm can be defined on any oriented graph: all that matters is to 
know the potential growth directions, or equivalently, the neighbors of the endpoint 
of the current partial polymer. However, the properties of the algorithm, namely 
the computational cost and the construction and acceptance rates, are crucially 
affected by the graph on which it is run. 

The set of underlying graphs is the class of graphs on which the growth procedure, 
that makes use of the concept of feeler, will be run. Intuitively, we see that the more 
neighbors arc allowed at each step, the higher the computational cost. Initially, 
everything happens on a <i-dimensional grid, in which every vertex has Q = 2d 
neighbors. An idea to lower the computational complexity is to impose vertices in 
an underlying graph to have the same out-degree k < Q, where k is a parameter of 
our algorithm to be adjusted. 

With these remarks in mind, a simple sketch of the RG* algorithm is as follows. 
First, choose an old polymer and remove it from the system. Then generate at 
random an underlying graph, and try to grow a candidate polymer on this under- 
lying graph using the idea of feeler. If the growth is not successful, then go back 
to the initial state. Otherwise, toss a coin to determine whether to accept or not 
this candidate. Clearly, the generation of an underlying graph conditions the fur- 
ther growth of the candidate polymer. In particular, the probability of acceptance 
depends on the underlying graph generated, which makes the connection with the 
auxiliary variable method. As we will see, delving into the details of every of these 
steps will require a lot of work. It is however very important to keep this simple 
sketch in mind. 

The rest of the paper is organized as follows. In Section 2, we define the notations 
and the global framework, and give an overview of the concepts of underlying graph 
and feeler by comparing the RG* algorithm to two simpler algorithms. Section 3 
then explains the RG* algorithm in details and computes the generating proba- 
bilities of the objects under consideration. We then prove in Section 4 that the 
Markov Chain run by the algorithm has the right stationary distribution, and find 
the correct probability of acceptance. Some considerations about the irreducibility 
of the algorithm are given in Section 4.2, and some possible extensions of the RG* 
algorithm are studied in Section 4.3. Finally, Section 5 is devoted to issues related 
to practical implementations of the algorithm. 



2. Framework 

2.1. General settings, notations. In all that follows, we are working on a finite 
<i-dimensional lattice Q in which each vertex has Q — 2d neighbors. Q will be 
referred to as the underlying lattice. To make sure that the lattice is finite and that 
each vertex has exactly Q neighbors, Q is endowed with a torus structure. So if one 
thinks of Q as a cube in dimension d, some vertices "go around". For d = 1, Q is a 
circle, and for d = 2, it is a torus. 
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More formally, the set of vertices of Q is (Z/aZ) for some fixed a, and two 
vertices are neighbors in Q if all their coordinates are equal, except for one in which 
they differ by 1 or -1 (in Z/aZ). 

On this lattice, our system consists of N polymers of fixed length. A polymer 
(v\, . . . , Vl) is a self- avoiding path of length L >1, the length being the number of 
vertices. Moreover, the same physical constraints that impose the polymers to be 
self- avoiding impose that two different polymers cannot intersect. We denote by S 
the state space of all the possible configurations, so S can be written: 

S = {(Cfc), k = 1 . . . N, Ck is a self-avoiding path of length L 

and C t n ■ = for i ^ j} 

Figure 1 shows a particular state of a system with N = 100 polymers of length 
L = 25 each and in dimension d = 2, so that the underlying lattice is actually a 
torus. Note that some polymers, which are in one piece on the torus, are discon- 
nected in such a plane representation. 

When L = 1, polymers are reduced to a vertex; for L = 2, the polymers are 
actually called dimers. A fair amount of literature exists on the so-called dimer- 
monomer problem, but none of them covers our case. There is a total of a d vertices, 
and the polymers occupy NL vertices: it is easy to see that as soon as NL < a d , it 
is possible to cover the underlying lattice with the specified polymers. Indeed, there 
clearly exists a self-avoiding path of length a d that covers Q: this is just a matter 
of suitably enumerating (Z/aZ) d . Then, you may divide this path into N pieces 
of length L each, and you have a particular state for a system that satisfies NL = a d . 

Since a polymer is an undirected self-avoiding path, we can think of a polymer C 
as an undirected subgraph of Q. However, we will sometimes need to give an orien- 
tation to the edges. This orientation is naturally induced by choosing an endpoint 
of the polymer. If C = (wi, . . . , Vl), then the choice of v\ will induce the orientation 
Vi — > v 2 — > . . . — > Vl, while the choice of vl induces vl — > Vl-i «i. 

We denote by q(-) the targeted probability distribution on our system. Our aim 
is to run a Markov chain that has q as stationary distribution. Typically, each 
state S £ S is assigned a probability q(S) = Z~ 1 e~ E ( s * > where E is an energy 
function, and Z is the normalizing constant, also called partition function. As in 
the Metropolis algorithm, our probability of acceptance will depend only on ratios 
of values of q, thus making no need to know Z . 

2.2. Main ideas for the growth of a polymer. The RG* algorithm tries to 
replace one polymer at each step. Hence the main difficulty that it has to address 
is the following: given N — I polymers that do not intersect each other, how do 
we grow a new one? In this section, we emphasize the two salient features of the 
RG* algorithm introduced earlier: the concepts of feeler and of underlying graph. 
In order to understand why these two ideas give very good results, it is helpful to 
have in mind the two following extreme algorithms. 

The first algorithm is an exhaustive algorithm, and works as follows. First, we 
pick a free vertex v\, i.e., a vertex that is not occupied by any of the N — 1 polymers. 
Then we take a free vertex v 2 neighbor of v\ in Q: we have constructed a partial 
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polymer (1)1,1)2) of length 2, and we can continue from v 2 - Then one of the two 
following events happens. 

Either we always have the possibility to pick up a neighboring unoccupied vertex, 
and we construct a polymer (vi, . . . ,Vl) which has the desired length L. Then we 
are done. 

Or at some point, we have grown a partial polymer (vi, . . . ,Vi) with 1 < i < L — 1 
such that Vi has no free neighbor: each neighbor either belongs to some other 
polymer or is one of the preceding vertices Vi, . . . , t>j_i. In this case, we recoil to Vi-i 
and consider again the partial polymer (vi, . . . , Vi-i), but this time, we do not grow 
towards Uj. The same discussion applies again: either we have another acceptable 
direction to grow the partial polymer, or we have to recoil to v^ 2 , ■ ■ ■ Observe that 
in this example, if from Vi-i we try another direction v[ ^ Vi such that from v[ 
we have to recoil again, then from two directions would now be forbidden, 

namely Vi and v[. 

Finally, we stop either when we have grown a polymer of length L, in which case 
we successfully grew a polymer, or when we have to recoil from the very first vertex 
v\, in which case the step was a failure. In the former case, we have randomly 
chosen one possible polymer starting from v\, while in the latter case, there was no 
acceptable polymer starting from v\. 

This algorithm is therefore efficient in the sense that it grows successfully a poly- 
mer whenever it is possible. The chain construction rate is thus the best possible. 
However, due to an exhaustive search when growing the new polymer, this answer 
may come at the cost of very long computations, especially if Q or L is large, and 
if the system is dense. 

At the other extreme, the second algorithm is very fast and works identically 
except that it forbids recoiling. In this algorithm, we keep picking up free neigh- 
bors at random, until cither we have grown an acceptable polymer, or we have no 
free neighbor around the extremity of the current partial polymer. This algorithm 
quickly terminates, because it does at most L — 1 trials, but the chain construction 
rate is very low. 

The concept of feeler makes it possible to tune the RG* algorithm between these 
two extremes. Another way to look at these two algorithms is the following: for the 
exhaustive algorithm, no vertex on the current partial polymer (vi, . . . ,Vi), except 
v\, is sure to be in the final polymer, if there is one. Indeed, it may happen that 
we recoil all the way back to Vi, and from vi, try another direction v 2 ^ v 2 . For 
the fast algorithm, as soon as a vertex is added to the current partial polymer, it 
is sure to be in the final polymer, if there is one. 

In the RG* algorithm, a partial polymer (vi,... ,Vi) can always be broken up 
into two parts: the first part (vi, . . . , v a) is sure to be in the potential final polymer, 
while the extremity («a+i, • • ■ ,d%) serves as a retractable feeler. In other words, 
there is an index A such that the partial polymer cannot recoil below «a, and this 
index is increasing according to certain rules that we describe in details in Sec- 
tion 3.3. The existence of a retractable feeler makes it possible to recoil from dense 
areas, thus improving the chain construction rate, while the existence of this index 
A keeps the computational complexity reasonable. 
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Figure 2: Example of an underlying graph in dimension 2 with parameter k = 2. 
The dashed lines represent the underlying lattice. We can see that each vertex has 
exactly k — 2 out-edges. The dotted vertex is the root. 



It is essential to observe that the three algorithms described above do not depend 
on the graph they are run on. In particular, they can be run on any oriented graph. 
Indeed, all that matters is to know which directions are allowed to try to grow the 
polymer. However, even if these algorithms could be run on any graph, this graph 
plays an important role. For instance, we have seen that the exhaustive algorithm 
is intractable when Q is high, because the number of paths explored may be of order 
exponential in Q. But since the algorithm that generates a polymer may be run on 
any graph, it is therefore a very natural idea to run it on some oriented subgraph 
G instead of Q itself. The graph on which the algorithm is actually run is precisely 
what we call the underlying graph. Since we consider subgraphs of Q, the out- 
degree of each vertex in an underlying graph is at most Q. The simplest underlying 
graphs, and the ones that we consider until Section 4.3.2, are graphs that have 
a constant out-degree k, meaning that each vertex of the subgraph has the same 
out-degree k < Q. For k < Q, the sets of allowable directions are actually reduced, 
leading to a gain in complexity. However, by forbidding some directions that could 
potentially lead to an admissible polymer, the construction rate is reduced as well. 

Figure 2 shows an example of an underlying graph with parameter k = 2 in 
dimension d = 2. We see that each vertex has indeed exactly two out-edges, and 
some edges are curved to highlight the torus structure of the underlying lattice. 
Finally, one can see a dot on a vertex: this vertex is the root of the underlying graph. 
It is a special vertex that indicates where to begin the growth of the polymer. More 
details on the structure and generation of underlying graphs are given in Section 3.2. 

Similarly as for £, the length of the feeler, the higher k, the higher the construc- 
tion rate and the higher the computational cost. So here again, a trade-off has to 
be made. 

3. The Recoil Growth Algorithm to generate multi-polymer systems 

3.1. Overview. With the heuristic description made in Section 2.2, we can now 
give an overview of the RG* algorithm. At the beginning of the algorithm, the 
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N polymers are placed on Q, for instance as the result of the procedure given in 
Section 2.1, but any other state is fine. 

Assume that the system is currently in the state Sold, defined by the configu- 
rations of the N different polymers on the underlying lattice Q. The first step of 
the algorithm is to choose a polymer G id uniformly at random among the ./V pos- 
sible, and to remove it from the system. In this modified system, with only N — 1 
polymers left, we now try to grow a new polymer C new . In order to do so, we first 
generate at random an underlying graph G ncw , then we try to grow a new polymer 
Cnow on G ncw : this is the RG* procedure. At this point, two things may happen: 

(i) the growth is not successful: then this step is a failure, and we go back to the 
original state S id by putting back C id- 

(ii) the growth is successful: we now have two states S id and S new that only differ by 
one polymer, C id in S id and C n0 w in S ne w According to the Metropolis algorithm, 
the right stationary distribution q is obtained by accepting the new state S ncw with 
a suitable probability. But here, by conditioning the generation of G now according 
to a graph G new , we introduced a skewness between S id and S new - By analogy with 
the auxiliary variable method, we need to generate an underlying graph G id that 
plays for S id the role that G ncw plays for S new . This generation is the next step of 
the RG* algorithm. Then the right probability of acceptance is not the probability 
of accepting S now compared to S id, but the probability to accept {S ncw ,G new ) 
compared to (Sold, G G id) that we denote by P a cc((S id, G G id), (S new , G„ ew )) ■ To 
compute this probability, we first need to compute the weights of G id and G ncw 
on Gold and G now respectively. Weights are complex objects that are defined in 
Section 3.3. Because they are directly connected to the probability P g (C|G) of 
generating the polymer G on the underlying graph G, they are nonetheless necessary 
in order to compute the right probability of acceptance. 

For the sake of notations, in the remainder of this paper, the subscripts old and 
new will be noted o and n respectively, so for instance G denotes C Q id- 
One step of the RG* algorithm can be summarized as follows: 



1. Choose a polymer G w.p. 1/N and remove it from the system 

2. Generate at random an underlying graph G n w.p. P u (G n ) 

3. Run the RG* procedure on G n 

(a) If the procedure did not successfully grow a new polymer: 

i . Put back G 
ii. Go to 1. 

(b) Else, we have grown a new polymer G n w.p. P g (C n \G n ): 

i. Generate at random an underlying graph G compatible 
with G w.p. P C (G |G ) 
ii. Compute the weights W(C n \G n ) and W(C \G ) 
iii. Set p = Pace ((S , G ), (S„,G n )) , flip a p-coin: 

(A) If heads, add G n to the polymer system 

(B) If tails, put back G 
iv. Go to 1. 
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With this description of our algorithm, it is straightforward to write down the 
transition probability P(S — ► S n ) from S Q to S n where S and S n differ by only 
one polymer C in S and C n in S n : 

(1) P(S ^S n ) = l ]T Pn(Gn)P g (C n \G n )P c (G \C )P, cc ((S ,G ),(S n ,G n )). 

G n , G 

Each term of this sum will be computed below. 

A complete description of the RG* algorithm, to be given next, is as follows: (i) 
define precisely the concept of underlying graph as well as the notion of compati- 
bility between an underlying graph and a polymer; (ii) explain the RG* procedure, 
in which the concept of feeler plays a central role, and finally (iii) define the weight 
of a polymer on an underlying graph, which makes it possible to to compute the 
right probability of acceptance. 

3.2. Underlying graphs. 

Definition 1. An underlying graph G C Q is a directed subgraph of Q rooted at a 
vertex r such that: 

1. for any vertex v E G\{r}, there exists a directed path from r to v 

2. each vertex has exactly k out-edges 

Remark: Several vertices may satisfy the same property as the root, namely ac- 
cessibility to all other vertices. However, the root plays a special role, as it will be 
the starting vertex for the growth of a polymer. 

As mentioned earlier, 1 < k < Q is a parameter of the algorithm. Note that 
when k = Q, every underlying graph spans the whole underlying lattice with every 
possible directed edge induced by Q. So in this case, an underlying graph is simply 
characterized by its root. When k = 1, an underlying graph is just an oriented 
path starting at the root, and ending as soon as a loop appears. Figure 2 shows an 
example of underlying graph with k = 2 in dimension 2. The dotted vertex is the 
root, and one can check that there exists a path from the root to any vertex of the 
underlying graph. 

Definition 2. We say that a polymer C and an underlying graph G are compatible 
if the root r of G is one of the two extremities of C and if C C G ( where the choice 
of r induces the orientation on C as explained in Section 2.1). 

The class of underlying graphs is the natural class of graphs to run the RG* 
procedure; as we will see, a polymer can be grown on some underlying graph if and 
only if they are compatible. 

3.2.1. Procedure to generate an underlying graph. After choosing a random polymer 
to be removed from the system, the first step of the RG* algorithm is to generate 
an underlying graph. The growth of the candidate polymer will be performed on 
this underlying graph. 

To generate an underlying graph, we proceed iteratively, taking care that every 
vertex added to the graph has exactly k out-edges. This algorithm is basically 
a greedy algorithm that at each step assigns out-edges to some randomly chosen 
vertex. 
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More precisely, we have two sets V and T: V is the set of vertices of the final 
underlying graph that we are trying to generate, whereas T is a waiting room for 
the vertices that will eventually end up in V, but have not been assigned out-edges 
yet. 

Initially, we choose the root r uniformly at random in the set of vertices that 
are not occupied by any of the N — 1 polymers in the system. Thus T = {r} and 
V = because we have not assigned out-edges to the root yet. The first step is 
to assign k out-edges to r: r has Q neighbors in Q, and we choose k out-edges 
out of these Q possible uniformly at random. Call v\ , . . . , Vk the corresponding 
neighbors or r. Then we have assigned out-edges to r but not to the k new vertices, 
so T = {v\, . . . , Vk} and V = {r}. Then we take any vertex in T, say v\, and we 
choose its k neighbors. Since r and v\ are neighbors in Q, it may happen that r 
is among these neighbors, so we have to be careful: we add to T the vertices that 
are not already in V or in T, and we move v\ from T to V. We keep on assigning 
out-edges to vertices in T until T = 0. 

We see that in this algorithm, except for the root, the current configuration of 
the system is not taken into account. We add vertices and out-edges independently 
of the polymers on Q. This leads to some inefficiency, and we explain in Section 5 
how to remedy to this problem in practice. 

When T is empty, it is clear that each vertex in V has exactly k out-edges, and 
that there exists a directed path from r to any vertex: we have indeed generated 
an underlying graph. Observe that there is no constraint on the in-degree of each 
vertex. However, since there is a path from the root to any vertex but the root, 
each vertex except the root has in-degree at least 1. The root is the only vertex 
that may have in-degree 0. 

In the following pseudo-code that explains how to generate an underlying graph, 
E is the set of edges we are trying to generate, and V and T arc as in the above 
informal description: 



1. Pick a free vertex r at random, and initialize as follows: 

(a) V = E = % 

(b) T={r} 

2. While T ^ 0: 

(a) Pick any vertex v in T 

(b) Choose k different neighbors (ui,...,i>fc) of v uniformly at random 

(c) For i = l...k 

i. Add the edge v — > Vi to E 
ii. If Vi £ V and v t £ T, add v t to T 

(d) Remove v from T and add it to V 



Since the underlying lattice Q is finite, it is clear that this procedure terminates 
in finite time. Moreover, it is easy to see that this procedure does not depend on the 
order in which you pick up vertices in T: once a vertex is in T, its set of out-edges 
does not depend on when you assign them. Figure 3 page 12 shows the six first 
steps of the generation of an underlying graph in dimension two with k = 2. 
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(e) (f) 



Figure 3: The six first steps of the generation of an underlying graph, (a): the root 
r is the dotted vertex, which we pick uniformly at random among the unoccupied 
vertices. The edges represent the N — 1 = 2 polymers left in the system. In the 
sequel of the generation of the underlying graph, the polymers play no role, so they 
are not drawn, (a)-(f): vertices with a cross will be in the final underlying graph, 
but have not been assigned out-edges to yet: they form the set T. At each step, 
we pick one vertex in T at random, and assign out-edges to it. The vertex then 
becomes part of the set V, and we remove the cross. If by assigning out-edges, we 
have discovered new vertices, we add these vertices to T. In (f), the generation is 
not over, since T has four more vertices. 
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Proposition 1. The probability P U (G) to generate the underlying graph G is given 
by 

(2) P U (G) = 

7 

where a — (?) , \G\ is the number of vertices in G, and 7 = a d — (N — 1)L is the 
number of free vertices when N — 1 polymers are in the system. 

Proof. The generation of G is unambiguous: the root is chosen with probability 1/7; 
then, each time we assign out-edges, we have to assign the right set of out-edges. 
Since there are 1/a such sets and the choice is uniform random, the probability of 
assigning the right set of out-edges is exactly a. Such a choice occurs for each of 
the \G\ vertices, hence the result. □ 

3.2.2. Procedure to generate a compatible underlying graph. As we have seen in the 
overview of the RG* algorithm in Section 3.1, the transition between the old and 
the new state is made symmetric by generating an underlying graph compatible 
with the old polymer C , as a comparison to the graph G n . 

The procedure to do so is essentially the same as to generate any underlying 
graph: the only difference is that when we try to assign out-edges to vertices that 
belong to C Q = (vi, . . . , v L ), we must take care that for each 1 < i < L — 1, we do 
have Vi — » v i+ i in the set of out-edges. Hence for these L — l vertices, one out-edge 
is imposed, and we have to choose the k — 1 others among Q — 1 possible. Hence 
we can derive the following property similarly as for general underlying graphs: 

Proposition 2. Given a polymer C of length L, the probability P C {G\C) to generate 
an underlying graph G compatible with C is given by 

\G\-L+loL-l 

(3) P C (G\C) = * ^— 

where (3 = (?Zi) an d a an d | C | art as in Property 1. 

Remark: P C (G\C) and P U (G) are proportional: we have P C (G\C) — nP u (G) where 
77 = 7/2 • is a universal constant that does not depend on the polymer or 

the underlying graph. 

3.3. Recoil Growth Procedure. In this section, we describe in details the pro- 
cedure that grows a polymer on a given underlying graph G. As explained before, 
this growth takes place on an underlying graph that has a constant out-degree k. 
Before describing the dynamics according to which the algorithm either extends an 
existing partial polymer or recoils, we have to understand the concept of feeler. 

3.3.1. Decomposition of a partial polymer C. During the growing process, a par- 
tial polymer C = . . . , v{) of length i < L can always be decomposed into two 
disjoint parts: the fixed part and the feeler. Before stating the definition of this 
decomposition, it is important to have in mind the rule that this decomposition 
stands for: 



Rule: During the growth process, we are allowed to recoil from a vertex if and only 
if it is in the feeler. 
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This means that if at some point we need to recoil from a vertex that belongs to 
the fixed part, the attempt is a failure. Since the feeler is the complementary part 
of the fixed part, one only needs to define the fixed part: 

Definition 3. A node Vj is in the fixed part if j = 1 or if during the history of the 
growth of C , there was a partial polymer (vi, . . . , Vj, Vj+i, . . . , Vj + i). 

Equivalently, the fixed part is the set of vertices v such that either v is the 
starting vertex of the polymer, or at some point in the growth history, a partial 
polymer has been grown through v and has been £ steps further. The previous 
definition immediately yields the following properties: 

Proposition 3. There exists an increasing index A such that the fixed part is of 
the form (v\, . . . ,va)- 

Proof. Consider a vertex vj in the fixed part, and the corresponding partial polymer 
C = (vi, . . . , Vj, Vj+i, . . . , Vj+i). Since the RG* procedure extends or shortens a 
partial polymer only one vertex at a time, this implies that necessarily, at some 
point, the shorter polymer (vi, . . . , Vj-\, Vj, . . . , Vj+i-\) was grown: this exactly 
means that j — I is in the fixed part as well. This proves that the fixed part is 
indeed of the form (v\, . . . , v\). 

Moreover, it is clear by definition that if a vertex belongs at some point to the 
fixed part, then it stays in the fixed part for the remainder of the growth process: 
the fixed part is increasing, hence so is the index A. □ 

Proposition 4. The feeler is of the form (ua+i, ■ • • , and is of length at most 
£ (and is possibly empty). 

Proof. By the previous proposition, all we have to show is that the feeler is indeed 
of length at most £. Imagine for a moment that the feeler is of length £ + 1: this 
means that we have a partial polymer C — (vi, . . . , va, va+i, • • • , VA+e+i) 

where A delimits the fixed part. But then, by definition of the fixed part, the 
vertex A + I should belong to the fixed part, which is not possible. □ 

We can now draw a picture of the way the fixed part and the feeler evolve. If 
C = (vi,...,Vi) grows towards Uj+i, the feeler is increased except when it was 
already of length I. If the feeler is of length t when C grows, then it stays of 
length £, and the fixed part grows. It is then straightforward to derive the formula 
A = max(l, L max — £), where L max is the length of the longest polymer grown in 
the history of C. 

Note that vertices Vj with j > L — £ are never in the fixed part, because we never 
grow a polymer of length larger than L. 

By playing on £, we can have the two extreme algorithms described in Section 2.2. 
For t = 0, we get the fast algorithm, because since the feeler is of length 0, there 
is no vertex from which it is allowed to recoil. And if one sets £ = L, it is always 
allowed to recoil from any vertex, and we get the exhaustive algorithm. So this 
parameter is the parameter that makes it possible to make a trade-off between the 
speed and the tractability of the algorithm. 

We can now fairly simply describe the RG* procedure given an underlying 
graph G. 
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3.3.2. Description of the RG* procedure. Given an underlying graph G, the RG* 
procedure keeps record of the current partial polymer C, the length L max of the 
longest partial polymer ever grown, and L sets Di. For a partial polymer of length 
at least i, Di represents the set of vertices that the vertex Vi has already tried 
as directions of growth. In particular, at any time, Di is a subset of the set of 
neighbors of in G. These sets are here to insure that the algorithm is free of 
loop: when you recoil to a vertex Vi, you must try a new direction, that you have 
not tried so far. If at some point, you recoil to Vi and every neighbor of Vi is in 
Di (equivalently, \Di\ = k), this means that you have again to recoil from Uj. This 
is possible only when Vi is in the feeler, which is easily verified by checking the 
condition i > max(l, L max — £). 

The RG* procedure works as follows: 



1. Initialization: 

(a) C = (v\), where V\ is the root of G 

(b) D 1 = 

(c) Lmax = 1 _ 

2. At each step, with C — (vi, . . . , Vi) : 

(a) If i = L, stop, C is a complete polymer 

(b) Else if \Di\=k, recoil: 

i. If i > max(l,i max — t), set C = (vi, . . . , and go to 2 

ii. Otherwise, stop, the generation has failed 

(c) Else, keep picking uniformly at random a vertex v neighbor 
of Vi in G and not in Di, add it to Di, and stop when either 
v is unoccupied or \Di\ = k: 

i. If v is unoccupied, set C = {v\, . . . , V{, v) , -Dj+i=0, update 
L max and go to 2 
ii. Else, \Di\=k, and recoil as specified in 2.(b) 



It is not hard to show that this procedure is free of loop, and therefore terminates 
in finite time. The absence of loops is insured by the sets Df. you cannot grow 
twice the same polymer because this would imply to choose as next direction a 
vertex that already belongs to some set Di . 

Figure 4 on page 16 shows the successful generation of a polymer in dimension 
d — 2, with parameters N = 3, L = 5, k = 2 and 1 = 2. 

With this description, it is clear that if the RG* procedure returns a polymer C, 
then C and G are compatible. Hence the following property holds: 

Proposition 5. If C and G are not compatible, then P g (C\G) = 0. 

We still have to compute the probability P g (C\G) when C and G are compatible. 
To get an hindsight of what the right answer is, let us consider the case £ = 0: write 
C = (v\, . . . , Vl), and consider any 1 < i < L — 1. In the process of growing C, we 
have necessarily grown at some point the partial polymer (vi, . . . , Vi), and from Vi, 
we have k choices. But since t = 0, any choice different than Vi + i will not lead to C. 
Indeed, if we choose v' ^ Uj+i, since I = 0, we will never recoil below v' , therefore 
we will not grow C. In opposition, if we choose v i+ i for each 1 < i < L — 1, we will 
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(a) L n 



(b) L n 



(c) L n 




(d) L n 



(e) L n 



(f) in 




(g) Ln 



(h) L n 



(0 ^0 



Figure 4: Example of the growth of a polymer with parameters d ~ 2, N = 3, 
L = 5, k = 2 and ^ = 2. The 2 polymers in the system are represented by the bold 
edges, the dashed arrows represent the underlying graph, (a): we start from the 
root, (b): we try the neighbor to the right, and we update the counter L max to 2. 
This vertex is occupied by another polymer, so we recoil to the root. Since I > 0, 
this move is allowed, (c): we try another direction, and go up, which is the only 
direction left possible since the cross symbolizes the set D\ of directions already 
tried, (d): we try the left vertex, and update L max . (e): we try the down vertex, 
and update L max . This vertex is occupied by another polymer, so we recoil, (f): 
we try the right vertex, which is occupied by the current polymer. Since this is the 
second try and k = 2, we have to recoil again. Since I > 1, this is admissible. If 
£ = 1, this step would yield a failure, because max(l, L max — 1) = 3, and we want 
to recoil from v 3 . (g)-(i): we try admissible neighbors that lead to a full polymer. 
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grow C: this happens with probability (l/k) L 1 , which is the answer for £ = 0. 

In the case £ > 0, we have still necessarily grown at some point the partial 
polymer (v\, . . . ,Vi), and from v i} there are still k potential choices. But in this 
case, if we choose a vertex v ^ this is not necessarily the end, because it may 
happen that we recoil back from v to Uj. In this case, and when k > 1, we then 
have a second opportunity to pick the right vertex Uj+i. 

Since £ > 0, when we grow from to v, w is in the feeler part, and we will 
eventually recoil back to Vi if and only if v never belongs to the fixed part. Since 
by definition, v belongs to the fixed part if and only if at some point, we grow the 
chain £ steps further, we get that we will recoil back to Vi if and only if the partial 
polymer (vi, . . . , v i7 v) cannot be extended £ steps further on G. Note that this 
equivalence is true only for vertices that potentially may belong to the fixed part, 
i.e., for vertices v% with i < L — £. 

For i > L — £ + 1, this equivalence becomes: we will recoil back to Vi if and only 
if the partial polymer (v\, . . . , Vi, v) cannot be completed. 

Definition 4. For 1 < i < L — 1, a neighbor v of Vi is called admissible if either 
i < L — £ and the partial polymer (vi, . . . , v i7 v) can be extended I steps further, or 
if i > L — £ and (vi , . . . , Vi, v) can be completed to an entire polymer. 
The elementary weight Wi of Vi is the number of admissible neighbors of Vi . 

Remark: In this definition, when the partial polymer (vi, . . . , Vi, v) is extended, 
possible interactions with the remaining part (v i+ i, . . . , vl) of the polymer are not 
taken into account. Moreover, note that when C and G arc compatible, t>j + i is 
always admissible, because we know that (v\, . . . , Vl) = C is possible. Hence in this 
case, we always have Wi > 1. And now we can compute the probability P g (C\G): 

Definition 5. W(C\G) — w i * s ^ e weight of the polymer C given the un- 

derlying graph G. 

Proposition 6. Given the underlying graph G compatible with C , we have 



Proof. Let i € {1, . . . , L — 1}. In the process of growing C, the partial polymer was 
at some point (v\, . . . , v{). By definition, if we choose as next vertex v, then we will 
eventually recoil back to Vi if and only if v is not admissible. Hence the probability 
from Vi to make the right choice is one over the number of admissible neighbors, 



Remark: To compute the elementary weight Wi , we have no choice but to use the 
exhaustive algorithm presented previously. Indeed, we need to know whether under 
certain constraints, we can grow a self-avoiding path of length £. So if £ is large, 
since we have to do this at most (k — 1)(L — 1) (when for each 1 < i < L — 1, we 
must try to grow a feeler from each of its k — 1 neighbors) computing the weight can 
be expensive. However, this is only to be done when we have successfully grown a 
polymer. The complexity here is linear in k and L, but exponential in £. 




i.e., 1/wi. 



□ 
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4. Properties of the Recoil Growth Algorithm 

Definition 6. The density p = NL/a d of the system is the ratio of the number of 
vertices occupied over the total number of vertices. 

In this section, we prove the following theorem, where it is assumed that the size 
a of the underlying lattice satisfies a > L, which is the relevant case in practice: 

Theorem 1. With the right probability of acceptance P acc ((S 07 G Q ), (S n , G n )), the 
RG* algorithm is reversible with stationary distribution q. The algorithm is more- 
over irreducible at densities smaller than [a/Lj/a. 

To prove this theorem, we first deal with the stationary distribution in the next 
section, and then discuss some issues related to the irreducibility. Note that we 
prove the existence of a non-empty region for which both the original RG and the 
RG* algorithms are irreducible. 

4.1. Reversibility. 

Lemma 1. Take any two states S and S n in S that differ by only one polymer 
C in S and C n in S n , and take any two underlying graphs G and G n compatible 
with C and G n respectively. If we choose 

(5) P „ 4(S „, G „), (S „, Gn ) )=mto ( 1 ,i|ffl|^l 
then q satisfies the detailed balance equations: 

q(So)P(S - Sn) = q(Sn)P(S n - S a ). 

Proof. Recall that we have from (1) 

P(S ^S n ) = ^ Yl Pu(Gn)P g (C n \G n )P c (Go\Co)P acc ((So,G ),(Sn,G n )). 
G n . G 

We can show that q not only satisfies the detailed balance equations, but satisfies 
a stronger "local" condition: if we define 

P((S ,G ) — ► (S n , G n )) = P u (G n )P g (C n \G n )P c (G \C )P acc ((S , G D ), (S n , G n )) , 

then P(S -» S n ) = l/iVEG n , Go P (( S °> G °) (Sn,G n )), and it is clearly suffi- 
cient to show that for all S Q , S n , G and G n , q satisfies 

(6) q(S )P((S , G ) -» (S„,G„)) = q(S n )P({S n ,G n ) -» (S ,G )). 

But we have seen that P u (G n )P c (G \Co) = P u (G )P c (G n \C n ) because there exists 
a constant n such that P C (G\C) = nP n {G), so (6) amounts to 

Pacc((^o,Go),(Sn,G n )) = q(S D )W (C D \G n ) 
Pacc((S n , G n ), {S , Go)) q(S ) W(C \G ) ' 

what is true by choice of the probability of acceptance. □ 

4.2. On the irreducibility of the Markov Chain. Knowing that q satisfies the 
detailed balance equations is not enough for the algorithm to work properly: one 
needs results on the irreducibility of the Markov Chain as well. This question seems 
to be a hard problem, and we aim at giving some hints in this section. 
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4.2.1. Proof of theorem 1 . 

Lemma 2. For any d,N,L and a>L, the chain is irreducible if p < [a/L\/a. 

Proof. The idea is to show that from any configuration S G S, we can reach a 
particular configuration S where all the polymers are straight, and lie in some 
specific boxes. The boxes are defined as follows. 

Q = { r L/a r L) d is a d-dimensional cube, and for any x G (Z/aZ)^" 1 , we consider 
the subset F x = {y = (y l ) G Q : (yi, . . . , Vd-i) = x,y d G aZ} where the d - 1 
first coordinates are fixed. F x is in bijection with Z/aZ. Since a > L, we write 
a = nL + r with n = [a/L\ > 1 and < r < L — 1. For each x, we can then divide 
F x into n boxes, the first box being in bijection with {0, . . . , L — 1}, the second with 
{L, . . . , 2L — 1}, etc. . . The total number of boxes B is equal to a d_1 n, therefore we 
get, using the definition of p and the hypotheses: 

A = 2L> L 

7VX pa ~ 

But iVL is the number of occupied vertices, so the previous inequality means that 
we have more boxes than occupied vertices. We study only the worst case, where 
B = NL: one of two things may happen. 

If each box is occupied by exactly one vertex, then each box is intersected by 
exactly one polymer. So we can consider each polymer in turn, and put each one 
in some box that it occupies. Each polymer will then be straight and lie in some 
box. 

If some boxes are occupied by more than one vertex, then necessarily, some boxes 
are empty. Then we can consider the polymers in turn and assign them to empty 
boxes while we have empty boxes. We stop either when we have no more empty 
boxes or when every polymer has been assigned to a box. In the latter case, we are 
done. And it is easy to see that the former case never happens, because by putting 
a polymer in a box, we create empty boxes. 

In any case, we can go to a state where all the polymers are in boxes. Such 
states are clearly connected, and the chain is therefore irreducible. □ 

Remark: With regards to simulation, the longer the polymers, the better. In [3], 
the authors investigate cases when L = 100. So this result insures irreducibility 
at very low densities, around 0.01, compared to the values we would be interested 
in, like 0.6. Moreover, one question that is still open at this point is to give a 
lower bound independent of the parameters of the system. For instance, it seems a 
reasonable bet that the algorithm is always irreducible for p = 10~ 10 . 

4.2.2. Informal discussion. In this section, we give examples of parameters for 
which the algorithm is irreducible, some for which it is not, and we state some 
conjectures. 

First, observe that if two states 5* and 5" differ by only one polymer, then the 
probability P(S —> S') of going from S to 5" in one step of the algorithm is strictly 
positive for any k and t. These two parameters obviously influence the value of 
such a probability, but it will nevertheless always be positive. The question of 
irreducibility does therefore not depend on these two parameters. 

For purposes of irreducibility, the algorithm is then characterized by iV, L and 
Q. Recall that Q = (Z/aZ) d is a cyclic cube in dimension d. Since p = NL/a d , 
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Figure 5: State where all the dimers are horizontal, except for one column, which 
contains the "whole", i.e., the free vertex. 



we see that the RG* algorithm is characterized by five parameters N, L, p, a and d, 
so we adopt the notation RG^(N, L, a, p) to refer to the RG* algorithm with these 
parameters. Any one of these parameters is uniquely determined by the four others 
thanks to the relation p = NL/a d . 

Before going on, we rule out trivial cases that are of no interest: for N — 1 or L = 
1, the algorithm is always irreducible. So in the rest of this discussion, we always 
assume N,L > 2. The case L = 2 corresponds to the so-called "monomer-dimcr 
system", which has received considerable attention in the literature, e.g. [6, 14], 
with [6] dealing with problems close to the ones we are interested in. However, the 
Metropolis algorithm proposed in [6] allows to remove or add dimers, in which case 
the problem of irreducibility is trivial. 

Proposition 7 deals with a simple case: 

Proposition 7. If p — 1, the algorithm is never irreducible. 

Proof. Assume that p = 1 and you start from some state S. Then removing a 
polymer C leaves exactly L free vertices. Hence for any state S' where C has been 
replaced by C , C and C necessarily occupy the same set of vertices. In particular, 
two vertices v and v' neighbors in Q which belong to different polymers in S will 
never be part of the same polymer. Since there exist such states, RG^(N 7 L,a, 1) 
is not irreducible. □ 

The next two propositions will help us build our intuition: 

Proposition 8. For any p < 1 and any other parameters, RG* d {N,2, a, p) is irre- 
ducible. 

Proof. We deal with the case d — 2, and when there is only one free vertex. When 
there are more free vertices or when d > 2, the same ideas can be extended to prove 
that the system is still irreducible. So from now on, we assume d = 2 and that 
there is one free vertex. 

We show that from any state, we can reach the state depicted in Figure 5, where 
all the dimers are horizontal, except on one column where they all are vertical. The 
idea is the following: we consider the number k of dimers that are "vertical", i.e., 
the number of dimers that are aligned along a column and not along a line. We will 
show that there exists a path, i.e., a sequence of moves, along which k is decreasing. 
Figure 5 represents a state where k is equal to its minimal value. 

For the parameters considered, a 2 = 2N + 1, hence a is odd. Since only one 
vertex is free, there is only one line on which there is a free vertex. So for the 
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(a) Move a 



(b) Move b 



(c) Move c 



Figure 6: The three elementary moves: a and b do not change k, whereas c decreases 
K by one. 



a — 1 other lines, the a vertices are occupied by dimcrs. Since a is odd, and since 
a horizontal dimer occupies an even number of vertices, there is an odd number of 
vertical dimers that have exactly one vertex lying on this line. In particular, there 
is at least one vertical dimer on each of these a — 1 lines. 



We now describe the path along which k is decreasing. This path is made of the 
three elementary moves a, b and c depicted in Figure 6. We consider the line with 
the whole: there is an even number of vertices occupied, therefore there is an even 
number of vertical dimers sharing a vertex with this line, which is possibly null. 

If there are at least two such dimers, we can move to the right thanks to move a 
until the vertex to the right of the whole is a vertical dimer, as in case (c) of 
Figure 6. Then by applying move c, we can reduce k by one. The whole is then on 
a new line. 

If there is no vertical dimer on the line of the whole, we can change line: first, 
we apply move a to go to the right until the whole is under some vertical dimer as 
in case (b). There exists such a dimer, because there is an odd number of vertical 
dimers on the above line. Then we apply move b, and we jumped two lines. Since 
there is an odd number of lines, by jumping two lines at a time, we can put the 
whole on any line (this is the same reason why you can always go under a vertical 
dimer). 

So now, we know that from any state, we can reach a state where n is minimum. 
It is not hard to see that such a state is as follows: in any line except the line with 
the whole, there is exactly one vertical dimer. To reach the state of Figure 5, one 
just has to align these (a— 1)/2 vertical dimers: it is not hard to do so by combining 
moves a and c, so we are done. □ 



Proposition 9. RG^iN, 5, a, 80/81) is not irreducible. 

Proof. We have 80/81 = 5N/a 2 , which leads by simple considerations to TV = 16n 2 
for some integer n > 0, and a = 9n. Now we restrict our attention to the case 
n = 1, in which case there are 81 vertices, so only one vertex is free. Figure 7 
zooms on a region of a state of such a system, where the polymers have a specific 
configuration around the only free vertex. Every vertex in any of the dashed area 
is occupied. Hence from this state, two things may happen. 
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Figure 7: Example which shows that RG%(N, 5, a, 80/81) is not irreducible: there 
is no free vertex in the dashed areas, so polymers in the dashed areas cannot move. 
Nor can the polymers surrounding the only free vertex. 



If we remove a polymer that is in the dashed area, the only area where we can 
grow another polymer is in the region freed by this polymer. Hence in this case, 
every polymer still occupies the same set of vertices. 

But if we remove a polymer surrounding the free vertex, then it is easy to see 
that the only possible way to grow it is at the exact same place. Indeed, simple 
considerations show that a new polymer cannot occupy this free vertex, because it 
acts like a dead-end: from this vertex, the polymer would have to pick one of two 
possible directions, and whatever the direction chosen, there is not enough room to 
grow the polymer. 

Hence from this state, the only states reachable are states where each polymer 
occupies the same set of vertices, so the system is not irreducible. 

For n > 1, we can repeat over space the state obtained in the case n — 1 to obtain 
a state that has the exact same property, hence the proposition is proved. □ 

From these two examples, we see that the parameter p is not sufficient to 
characterize the irrcducibility of the system, meaning that it can happen that 
RG d (N, L,a, p) is irreducible, whereas RG* d {N' ', V ',a' ', p') with p' < p is not. How- 
ever, it seems intuitively that the critical parameter is actually the length of the 
polymers, and so one could wonder whether for fixed length, and for fixed dimen- 
sion d, the density is not enough to characterize the irreducibility. This idea is the 
object of the following conjecture: 

Conjecture 1. If RG d (N, L, a, p) is irreducible, then RG* d {N' ', L,a' ', p') is irre- 
ducible for any N' , a' such that p' < p. 

If this conjecture were to be true, then one could consider the quantity 

Pd{L) = sup{p : RG d (N, L, a, p) is irreducible}. 

N,a 
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Figure 8: Example of a state in RG2(N,L,a,p) that cannot be extended to 
RG* 2 {N + l,L,a,p), although RG%(N + l,L,a,p) is irreducible (N = 3, L = 2 
and a = 3). Circles represent unoccupied vertices. 



For any N, a, RG d (N, L,a, p) would be irreducible if and only if p < Pd(L), and 
a second conjecture, that would reflect the fact that the longer the polymers, the 
harder it is for the system to be irreducible, would be: 

Conjecture 2. pd(L) is decreasing in L. 

Finally, we state the the most natural, and probably easiest, conjecture: it says 
that if the system is irreducible, then removing polymers, shortening the polymers 
or growing the box should leave the system irreducible. 

Conjecture 3. Suppose that RG d (N, L, a, p) is irreducible. Then RG* d {N' ', L', a', p') 
is irreducible for any N' < N,L' < L and a > a' , where the value of p' is determined 
by the other parameters. 

Note that this conjecture would immediately imply Conjecture 2. A natural idea 
to try to prove this last conjecture is the following: suppose for instance that you 
want to show that RG d (N ~ l,L,a,p') is irreducible, given that RG d (N,L,a,p) 
is. Then you could add a fake polymer, make any move in RG d (N, L, a, p), remove 
the fake polymer, and you are done. However, it is not always possible to add a 
polymer to RG d (N — l,L,a,p'), even if RG d (N,L,a, p) is irreducible, as shown 
in the simple example on Figure 8. Hence, Conjecture 3 would be true if from 
any state in RG* d (N — 1, L, a, p), we could reach a state where we can add a fake 
polymer. Similar ideas hold for L and a. 

4.3. Extensions of the Recoil Growth Algorithm. 

4.3.1. Polymers of different lengths. All what we have done still holds when the 
polymers have different lengths, under the reasonable condition that we always try 
to replace a polymer by another polymer with the same length. Then it is easy to 
check that all the above formulas still hold, where each time L is to be replaced by 
the length Lc of the polymer C that we are trying to replace, and the constant 7 
is to be replaced by a number jl c . 
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When the polymers have different length, a new problem arises which we have not 
mentioned so far: the polymers are then de facto distinguishable. Distinguishability 
influences the irreducibility of the algorithm. A trivial example is when L — 1 and 
p = 1, i.e., the full lattice is covered by its N vertices as N polymers: if the polymers 
are indistinguishable, the system is irreducible, because there is only one state. But 
if they are distinguishable, then the system is not irreducible, because there are N\ 
isolated states. 

4.3.2. Underlying graphs with random out-degrees. In this section, we study an ex- 
tension proposed in [3] that relaxes the constraint that k needs to be integer- valued, 
and adopt it in our framework. Although for the original RG algorithm, this does 
not require to change the probability of acceptance Pace, for the RG* algorithm, 
it does. We compute below the correct value for this new algorithm, to which we 
refer as the extended RG* algorithm. 

In this extension, the out-degree of each vertex of an underlying graph is random 
instead of deterministic, so the definition of an underlying graph given earlier does 
not apply in this section. However, the generation of underlying graphs, compatible 
or not, is very similar as in the case where the out-degrees are deterministic. There 
is only one additional step: instead of assigning k or k — 1 uniformly at random, 
we first draw a number 1 < k < Q with probability p K , and then we pick k or 
k — 1 vertices uniformly at random. The probability distribution p is therefore an 
additional parameter to the algorithm. Note that we impose po = 0, because in the 
process of growing a compatible underlying graph, we have to assign at least one 
out-edge to the vertices on the polymer. 

The interest of this extension is to allow any mean random degree < k >= Kp K , 
so that the optimization of the algorithm can be more efficient. 

Definition 7. W°(C\G) is the weight of C on the compatible underlying graph G 
under £ = 0. In other words, W°(C\G) = Yli=i d{vi) if C = (vi, . . . , Vl) and d(v) 
is the out-degree ofveG. 

We use the same notations as in Lemma 1 to state the result of this section: 

Lemma 3. If one sets 

(7) P ((S G)(S C W minfl ^ W ^\ G ^ W °^\ G ^ 1 \ 

then q is the stationary distribution of the extended RG* algorithm. 

Proof. We show that q satisfies the same "locale" balance equations (6). The 
probability of acceptance P acc is given, so we have to compute the new probabilities 
of generating our objects under the extended RG* algorithm, which correspond to 
formulas (2), (3) and (4). 

Since only the generation of the underlying graphs changes in this algorithm, 
formula (4) still holds with the same definition for the uii. Note that Wi is not 
bounded by k anymore, but rather by the out-degree d{vi) of Vi. More generally, 
we note d(v) the out-degree of v, with no possible confusion on the underlying 
graph that is considered. 



A VARIANT OF THE RECOIL GROWTH ALGORITHM 



25 



We need some notations: let G be a polymer, G a compatible underlying graph, 
and 1 < i < Q. We note on — (9) , ft = (9~i) ■ C is G minus its extremity that 
is not the root of G, h(G) = \{v e G : d(v) = i}\, q l {G\C) = \{v e G\C : d(v) = i}\ 
and qi(G\C) = \{v G C : d(v) = i}\. 

Now if we turn our attention to the generation of a general underlying graph, 
we see that formula (2) becomes 

1 1 Q 

Pu(G) = - ]J Pd(v)<Xd(v) = - \{{p l a l ) MG) - 

' veG ~ i=i 

Indeed, we assign the out-degree d to the vertex v with probability p d , and then, 
there are \jotd different ways to choose its neighbors. Note that = implies 
fcj(G) = 0, and we adopt the convention that 0° = 1. For the same reasons, 
formula (3) becomes 

1 1 Q r 

^c(gic) = 2 n Pd(v)a d{v) n P( ,(„)^(„) ^n[(P' a -) ,,(Gic) W')' ,(Gic) ■ 

vec\c vec i=1 

Using \C\ = L - 1, fcj(G) = %(G|C) + ^(G|C), ft - Qaj/i and TU°(G|G) - 
n„ e c = Hi i 9 ^ 6 ' 1 "', we finally can rewrite 

Pc(G\C) = P U (G)^°(G|G)- 1 . 
Putting all these formulas together, it is then easy to check that (6) amounts to 

Pacc((So,G ),(S n ,G n )) _ qjS^WiCnlG^W^CnlGn)- 1 

Pacc((S„,G n ),(S ,G )) g (5 )W(G |G )W°(G |G )- 1 • 
The acceptance probability chosen exactly satisfies this equality. □ 

Remark: When p is supported by only one point k, the terms W° cancel out, and 
we find the probability of acceptance for the standard RG* algorithm. 

5. IMPLEMENTATION 

We have mentioned at several occasions throughout this paper that the original 
algorithm in [3] is actually an efficient implementation of the RG* algorithm as 
defined in the present paper. In this section, we present this implementation, cast 
in our framework. 

The inefficiency of the RG* algorithm comes from the generations of the under- 
lying graphs. As we have defined them, the only way they interact with the rest of 
the system is through their roots. The root of an underlying graph has to be a free 
vertex, but once it is chosen, we do not care whether vertices which we add belong 
to existing polymers or not, nor do we care whether they are at a distance larger 
than L from the root. However, we will never visit such vertices in the process of 
growing a polymer. 

We have already stressed out that the underlying graph and the growth of the 
polymer are independent concepts. This has made possible to define and study 
them separately. But since they are independent, nothing forbids to generate the 
underlying graph while growing the polymer. This way, we are sure to only generate 
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the parts of the underlying graph that we visit during the growth procedure. So 
the idea is just to grow the polymer as usual, and to complete the underlying graph 
when the current endpoint of the polymer has not been assigned out-edges yet. 

If d(v) is the out-degree of vertex v, the entangled RG* algorithm becomes as 
follows: 



1. Initialization: 

(a) Pick an unoccupied vertex r uniformly at random 

(b) Set T = {r} and V = E = 

(c) Set C= (r),Di = _0 and L max = 1 

2. At each step, with C — (v\, . . . , Vi) : 

(a) If i = L, stop, C is a complete polymer 

(b) If Vi has not been assigned out-edges, i.e. Vi&T: 

i. Set d(vi) — k with probability tt k 
ii. Choose uniformly at random d(vi) different vertices among 
the Q neighbors of Vi in Q 
iii. For each vertex v chosen: 

(A) Add the edge v; t — > v to E 

(B) If v £ T and v (£ E, add v to T 
iv. Remove Vi from T and add it to V 

(c) If \Di\ = d(vi) , recoil: 

i. If i > max(l, L maK — £) , set C = {v\, . . . , i>i_i) and go to 2 
ii. Otherwise, stop, the generation has failed 

(d) Else keep picking uniformly at random a vertex v neighbor of Vi 

in G and not in Di , add it to Di , and stop when either v is unoccupied 
or \D l \ = d(vi): 

i. If v is unoccupied 

(A) Set C = (vi, . . . ,Vi,v),D i+ i = and update L max 

(B) If v (£ T and v E, add v to T 

(C) Go to 2 

ii. Else, \Di\ = d(vi) , and recoil as specified in 2.(c) 



At the end of this procedure, it can happen that the underlying graph is not 
complete: some vertices can have not been assigned out-edges. It is then important 
to keep the current partial underlying graph in memory, because this graph will 
be further completed if the next step of the algorithm is to compute the weights. 
Indeed, the same idea applies in this case: when we compute the weights, we do as 
usual, and we complete the current partial underlying graph when required. 
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